Genome Sequencing of Omicron Variants of SARS-CoV-2 Circulating in Bangladesh during the Third Wave of the COVID-19 Pandemic

ABSTRACT Coding-complete genome sequences of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) were obtained from 39 nasopharyngeal swab samples collected in January 2022 in Dhaka, Bangladesh, during the third wave of the COVID-19 pandemic, using Illumina MiniSeq sequencing technology. Sequence analysis showed that all of them belonged to the WHO-designated variant of concern (VOC) Omicron. Several Omicron sublineages were detected, among which sublineage BA.2 (Nextstrain clade 21L) was the most prevalent.

Starting from Wuhan, Hubei Province, China, in late December 2019, the COVID-19 pandemic has devastated the whole world, causing huge numbers of cases and deaths (1,2). The causative organism, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a member of the Betacoronavirus genus in the Coronaviridae family, which also contains the human pathogens SARS-CoV and MERS-CoV (3,4). In Bangladesh, after a relatively long period with a low number of cases, COVID-19 cases surged rapidly during January 2022 (5). Public health decision making required knowing which variants were circulating in the country, and this study was undertaken to meet that need.
Nasopharyngeal swab samples were collected from patients attending the fever clinic of Bangabandhu Sheikh Mujib Medical University, Dhaka, Bangladesh, during January 2022. Real-time reverse transcription-PCR was performed to detect SARS-CoV-2 RNA using the Sansure novel coronavirus (2019-nCoV) nucleic acid diagnostic kit. A total of 39 samples were selected for sequencing (cycle threshold [CT] of <25 for both the ORF1ab and N genes). Briefly, viral nucleic acid was extracted using the QIAamp viral RNA minikit (Qiagen, Germany). First-strand cDNA synthesis was done using the GoScript reverse transcriptase system (Promega Corporation, USA). Enrichment was done using COVIDSeq primer pools 1 and 2 (CPP1 and CPP2, respectively) (6). The library was prepared using the Illumina COVIDSeq assay kit according to the manufacturer's guidelines (document no. 1000000126053 v07). Prepared libraries were purified, pooled, and then quantified with a Qubit double-stranded DNA (dsDNA) high-sensitivity (HS) assay kit. Libraries were normalized to a 4 nM concentration. Pooled normalized libraries were denatured and finally diluted to a 1.2 nM concentration according to the Illumina MiniSeq system denature and dilute libraries guide (document no. 1000000002697 v04). The libraries were then loaded and run on an Illumina MiniSeq instrument following the standard protocol for 150-bp paired-end reads. DRAGEN COVID Lineage app v3.5.8 was used to process and assemble the raw reads as well as for variant calling and lineage determination (7).
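The sample selection rule and the normalization step described above can be illustrated with a short Python sketch. It is not part of the study's workflow: the function names, the example CT values, and the 20-µl final volume are assumptions made for illustration, and the dilution uses the generic relation C1 × V1 = C2 × V2 rather than any kit-specific worksheet.

```python
CT_CUTOFF = 25.0   # from the text: CT < 25 for both targets
TARGET_NM = 4.0    # from the text: libraries normalized to 4 nM


def eligible_for_sequencing(ct_orf1ab: float, ct_n: float,
                            cutoff: float = CT_CUTOFF) -> bool:
    """Return True only if BOTH targets are strictly below the CT cutoff."""
    return ct_orf1ab < cutoff and ct_n < cutoff


def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float):
    """Volumes (µl) of stock library and diluent, from C1*V1 = C2*V2.

    final_ul is a hypothetical working volume, not a value from the study.
    """
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_nm * final_ul / stock_nm
    return stock_ul, final_ul - stock_ul


if __name__ == "__main__":
    # Illustrative CT values only; these are not measurements from the study.
    print(eligible_for_sequencing(18.2, 19.5))
    print(eligible_for_sequencing(24.0, 26.1))
    # A hypothetical 10 nM library brought to 4 nM in 20 µl.
    print(dilution_volumes(10.0, TARGET_NM, 20.0))
```

Requiring both targets to pass, rather than either one, mirrors the wording "for both the ORF1ab and N genes".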
The study was approved by the institutional review board (IRB) of Bangabandhu Sheikh Mujib Medical University (IRB no. BSMMU/2021/6137). Written informed consent was obtained from each patient before data collection. All information obtained from the study participants was kept confidential.
In total, 39 SARS-CoV-2 genome sequences were obtained. The sequence identifier (ID), GISAID accession no., GenBank accession no., SRA accession no., total genome length, Nextstrain clade, and Pango lineage for each sample are listed in Table 1. Amino acid substitutions of each sequence can be viewed online at https://figshare.com/articles/dataset/Table_2_docx/19613880/1. Genome lengths ranged from 29,749 to 29,798 bp. Median coverage ranged from 832× to 2,025×. All sequences were assigned to the WHO-designated variant of concern (VOC) Omicron and to Nextstrain clades 21K and 21L. Omicron sublineages BA.1, BA.1.1, BA.1.17, and BA.2 were present, among which BA.2 was the most prevalent.
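As an illustration of how the lineage and clade summary above could be tabulated from per-sample assignments, the following Python sketch counts Pango lineages and maps them to Nextstrain clades. The mapping covers only the four lineages named in the text (BA.1 and its descendants BA.1.1 and BA.1.17 fall in clade 21K, and BA.2 in clade 21L). The function name and the example list are hypothetical and do not reproduce the per-lineage counts of Table 1.

```python
from collections import Counter

# Only the lineages reported in the text; any other lineage raises KeyError
# so that an unexpected assignment is noticed rather than silently miscounted.
CLADE_BY_LINEAGE = {
    "BA.1": "21K",
    "BA.1.1": "21K",
    "BA.1.17": "21K",
    "BA.2": "21L",
}


def summarize_lineages(lineages):
    """Return (lineage counts, clade counts, most prevalent lineage)."""
    if not lineages:
        raise ValueError("no lineage assignments supplied")
    lineage_counts = Counter(lineages)
    clade_counts = Counter(CLADE_BY_LINEAGE[name] for name in lineages)
    most_prevalent = lineage_counts.most_common(1)[0][0]
    return lineage_counts, clade_counts, most_prevalent


if __name__ == "__main__":
    # Hypothetical assignments for illustration; not the study's data.
    example = ["BA.2", "BA.1", "BA.2", "BA.1.1", "BA.2", "BA.1.17"]
    lineage_counts, clade_counts, top = summarize_lineages(example)
    print(dict(lineage_counts))
    print(dict(clade_counts))
    print(top)
```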
Data availability. The coding-complete genome sequences and metadata of all 39 samples were submitted to the GISAID database (http://www.gisaid.org), where they can be accessed under accession no. EPI_ISL_10894052 to EPI_ISL_10894090 (Table 1). The coding-complete genome sequences can also be accessed in the NCBI database (accession no. ON026021 to ON026059; see Table 1). Raw reads were deposited in the SRA database (accession numbers are listed in Table 1) under BioProject PRJNA827255.
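For readers who want to retrieve the records programmatically, the accession ranges above can be expanded into individual identifiers. The Python sketch below assumes that each range is contiguous and that the numeric suffix keeps its width (including leading zeros); the helper name expand_accessions is hypothetical. Both ranges expand to 39 identifiers, matching the number of samples.

```python
import re


def expand_accessions(first: str, last: str):
    """Expand an inclusive accession range such as ON026021..ON026059.

    Assumes both ends share the same prefix and that the numeric suffix
    is zero-padded to a fixed width.
    """
    pattern = re.compile(r"^(.*?)(\d+)$")
    m_first, m_last = pattern.match(first), pattern.match(last)
    if not m_first or not m_last:
        raise ValueError("accession must end in digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accession prefixes differ")
    prefix = m_first.group(1)
    width = len(m_first.group(2))
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    if stop < start:
        raise ValueError("range is reversed")
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


if __name__ == "__main__":
    gisaid_ids = expand_accessions("EPI_ISL_10894052", "EPI_ISL_10894090")
    genbank_ids = expand_accessions("ON026021", "ON026059")
    print(len(gisaid_ids), gisaid_ids[0], gisaid_ids[-1])
    print(len(genbank_ids), genbank_ids[0], genbank_ids[-1])
```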